Bayesian probabilistic approach for predicting backbone structures in terms of protein blocks.
Authors
Abstract
Using an unsupervised cluster analyzer, we identified a local structural alphabet composed of 16 folding patterns of five consecutive Cα atoms ("protein blocks"). The dependence between successive blocks is taken into account explicitly. A Bayesian approach based on the relationship between protein blocks and amino acid propensities is used for prediction and yields a success rate close to 35%. Dividing the sequence windows associated with certain blocks into "sequence families" improves the prediction accuracy by 6%. The accuracy exceeds 75% when the four top-ranked predicted protein blocks are kept at each site of the protein. In addition, two strategies are proposed: the first determines the number of protein blocks needed at each site to meet a user-fixed prediction accuracy; the second determines which protein sites can be predicted with a user-fixed number of blocks and a chosen accuracy. Applied to the ubiquitin conjugating enzyme (an alpha/beta protein), the second strategy shows that 91% of the sites can be predicted with an accuracy above 77% when only three blocks per site are considered. The proposed prediction strategies improve our knowledge of sequence-structure dependence and should be very useful in ab initio protein modelling.
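The Bayesian ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that window positions contribute independently, it ignores the dependence between successive blocks, and the function name `rank_blocks`, the two-block alphabet, the three-residue window and all probabilities are hypothetical toy values chosen only to show how a posterior over blocks is formed and how the top-ranked blocks are kept at a site.

```python
import math

# Toy, made-up tables (assumptions for illustration only).
# priors[b] stands for P(block b); propensities[b][i][aa] stands for
# P(amino acid aa at window position i | block b).
priors = {"m": 0.6, "d": 0.4}
propensities = {
    "m": [{"A": 0.5, "V": 0.1}] * 3,
    "d": [{"A": 0.1, "V": 0.5}] * 3,
}


def rank_blocks(window, priors, propensities, top_k=4):
    """Return the top_k (block, posterior) pairs for one sequence window.

    Applies Bayes' rule, P(b | window) proportional to
    P(b) * product_i P(aa_i | b), assuming independent window positions.
    """
    log_scores = {}
    for block, prior in priors.items():
        score = math.log(prior)
        for pos, aa in enumerate(window):
            # A small floor avoids log(0) for residues missing from the toy table.
            score += math.log(propensities[block][pos].get(aa, 1e-6))
        log_scores[block] = score

    # Normalise in log space (log-sum-exp) to obtain posterior probabilities.
    peak = max(log_scores.values())
    total = sum(math.exp(s - peak) for s in log_scores.values())
    posteriors = {b: math.exp(s - peak) / total for b, s in log_scores.items()}

    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    for block, prob in rank_blocks("AAV", priors, propensities, top_k=2):
        print(block, round(prob, 3))
```

Keeping several ranked blocks per site, rather than only the best one, is what the abstract refers to when it reports the accuracy obtained with the first four predicted blocks.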
Similar resources
Bioinformatic analysis of the protein/DNA interface
To investigate the principles driving recognition between proteins and DNA, we analyzed more than a thousand crystal structures of protein/DNA complexes. We classified protein and DNA conformations by structural alphabets, protein blocks [de Brevern, Etchebest and Hazout (2000) (Bayesian probabilistic approach for predicting backbone structures in terms of protein blocks. Prots. Struct. Funct. Ge...
A Probabilistic Bayesian Classifier Approach for Breast Cancer Diagnosis and Prognosis
Medical diagnosis is among the most important components of treatment policy. Recently, significant advances have been made in medical diagnosis using data mining techniques. Data mining, or knowledge discovery, searches large databases to discover patterns and estimate the probability of future occurrences. In this paper, a Bayesian classifier is used as a non-linear dat...
Local backbone structure prediction of proteins
A statistical analysis of the PDB structures has led us to define a new set of small 3D structural prototypes called Protein Blocks (PBs). This structural alphabet comprises 16 PBs, each defined by the (phi, psi) dihedral angles of 5 consecutive residues. The amino acid distributions observed in sequence windows encompassing these PBs are used to predict, by a Bayesian approach, the local 3...
A theoretical study on quadrupole coupling parameters of HRPII Protein modeled as 310-helix & α-helix structures
A fragment of Histidine rich protein II (HRP II 215-236) was investigated by 14N and 17O electric field gradient, EFG, tensor calculations using DFT. This study is intended to explore the differences between 310-helix and α-helix of HRPII both in the gas phase and in solution. To achieve the aims, the 17O and 14N NQR parameters of a fragment of HRPII (215-236) for both structures are calculated...
Journal title:
- Proteins
Volume 41, Issue 3
Pages -
Publication date: 2000